ABSTRACT

The  image  transformation  due  to  camera  rotation  relative  to  a  stationary  scene  is 
analyzed,  and  the  associated  transformation  rules  of  “features”  given  by  weighted 
averaging  of  the  image  are  derived  by  considering  infinitesimal  generators  on  the  basis  of 
group  representation  theory.  Three  dimensional  vectors  and  tensors  are  reduced  to  two 
dimensional  invariants  on  the  image  plane  from  the  viewpoint  of  projective  geometry. 

Three  dimensional  invariants  and  camera  rotation  reconstruction  are  also  discussed.  The 
result is applied to the shape recognition problem when camera rotation is involved.
1. INTRODUCTION


The  problem  we  consider  in  this  paper  is  as  follows.  Suppose  the  camera  is  rotated 
by a certain angle around its focus relative to a stationary scene. Then, a different projected
image is seen on the image plane. However, since a point on the image plane
corresponds  to  a  ray  in  the  3D  scene,  occlusion  is  not  affected  by  camera  rotation.  If  the 
amount  of  camera  rotation  is  known,  the  original  image  can  be  recovered.  (Here,  we  do 
not  consider  the  effect  of  the  image  boundary.  We  assume  that  the  image  plane  is 
sufficiently  large  and  that  the  object  or  scene  of  interest  is  always  included  in  the  field  of 
view.)  This  means  that  the  information  content  of  the  image  is  not  affected  by  the  2D 
image  transformation  induced  by  the  camera  rotation. 

Suppose  the  viewed  image  is  characterized  by  a  finite  number  of  parameters  or 
features.  If  the  camera  is  rotated,  the  image  is  also  changed  so  that  the  features  change 
their  values.  If  the  set  of  features  is  invariant  in  the  sense  that  these  new  values  are 
completely  determined  by  the  original  values  and  the  amount  of  the  camera  rotation,  we 
can  predict  the  values  of  the  features  which  would  be  obtained  if  the  camera  were 
rotated  by  a  given  amount.  Conversely,  if  we  are  given  two  views  of  the  same  object 
obtained  from  different  camera  orientations,  we  can  reconstruct  the  amount  of  camera 
rotation  R  which  would  transform  the  values  of  the  features  to  prescribed  values.  An 
important fact is that in this process we need not know the point-to-point correspondence.
All computations are based on the observed features, which are global quantities.

These  considerations  are  very  important  in  many  problems  of  computer  vision  and 
pattern  recognition  when  the  camera  orientation  is  controlled  by  a  computer.  Even  if 
the  camera  is  fixed,  various  types  of  analysis  of  the  image  become  easy  if  we  apply  to  the 
image  the  transformation  equivalent  to  camera  rotation.  This  technique  is  used  for  the 
shape-from-texture problem by Kanatani and Chou [7] and for the interpretation of
lengths and angles by Kanatani [6]. A similar analysis is done when the object is moving
and we are observing the optical flow (Kanatani [5]). In this paper, we will discuss, as a
typical example, the center of gravity and principal axes of a given region to see how the
invariant  properties  can  be  utilized  to  recognize  the  shape  and  to  reconstruct  the  (actual 
or  hypothetical)  camera  rotation. 

2.  CAMERA  ROTATION  AND  INVARIANT  FEATURES 

Let f be the focal length of the camera. The camera image is thought of as the projection
onto an image plane located at distance f from the focus O; a point P in the
scene is projected onto the intersection of the image plane with the ray connecting point
P and the focus O. Let us choose an XYZ-coordinate system such that the focus O is
at the origin and the Z-axis coincides with the camera optical axis. Choose an xy-coordinate
system in such a way that the x- and y-axes are parallel to the X- and Y-axes
with (0,0,f) as the origin. This xy-plane plays the role of the image plane (Fig. 1).
A point (X,Y,Z) in the scene is projected onto (x,y) on the image plane, where

$$x = fX/Z, \qquad y = fY/Z. \tag{2.1}$$

Consider  a  camera  rotation  around  its  focus  O  and  the  induced  transformation  of 
the image (Fig. 2). Suppose the camera is rotated by rotation matrix R, which is an
orthogonal matrix, i.e., $RR^T = I$. Then, the point in the scene which was seen at (x,y)
now moves to another point (x',y') given by the following theorem.

Theorem 1. The image transformation induced by camera rotation $R = (r_{ij})$ is given by

$$x' = f\,\frac{r_{11}x + r_{21}y + r_{31}f}{r_{13}x + r_{23}y + r_{33}f}, \qquad y' = f\,\frac{r_{12}x + r_{22}y + r_{32}f}{r_{13}x + r_{23}y + r_{33}f}. \tag{2.2}$$

Proof. A rotation of the camera by R is equivalent to the rotation of the scene in the
opposite sense. If the scene is rotated by $R^{-1}$ ($= R^T$), where T denotes transpose, point
(X,Y,Z) moves to point (X',Y',Z') where

$$\begin{pmatrix} X' \\ Y' \\ Z' \end{pmatrix} = R^T \begin{pmatrix} X \\ Y \\ Z \end{pmatrix}. \tag{2.3}$$


This  point  is  projected  to  (x',y')  on  the  image  plane,  where  x'=fX'  /  Z'  and  y'=fY' / Z' . 
Combining  this  with  eqns  (2.1),  we  obtain  eqn  (2.2). 
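As an illustration of Theorem 1, the following minimal sketch applies eqn (2.2) to one image point and then undoes it with $R^T$. The helper names (`transform_point`, `rot_x`, `rot_z`), the test rotation and the sample point are assumptions made only for this example; they do not appear in the paper.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transform_point(R, x, y, f):
    """Eqn (2.2): image point (x, y) -> (x', y') under camera rotation R."""
    # (X', Y', Z') is proportional to R^T (x, y, f), cf. eqn (2.3).
    Xp = R[0][0] * x + R[1][0] * y + R[2][0] * f
    Yp = R[0][1] * x + R[1][1] * y + R[2][1] * f
    Zp = R[0][2] * x + R[1][2] * y + R[2][2] * f
    return f * Xp / Zp, f * Yp / Zp

f = 1.0                                  # focal length (arbitrary unit)
R = matmul(rot_z(0.3), rot_x(0.2))       # arbitrary test rotation
x, y = 0.4, -0.25                        # arbitrary image point
xp, yp = transform_point(R, x, y, f)
# Interchanging R and R^T gives the inverse transformation.
x_back, y_back = transform_point(transpose(R), xp, yp, f)
print((xp, yp), (x_back, y_back))
```

The round trip recovers the original point up to rounding, and no scene depth enters the computation.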


It  should  be  emphasized  that  the  image  transformation  due  to  camera  rotation  does 
not  require  any  knowledge  about  the  scene  and  that  the  transformation  has  an  inverse, 
which is obtained by interchanging R and $R^T$. This means that transformations of the
form of eqn (2.2), which form a subgroup of the 2D projective transformation group, do
not alter the information content of the image as long as the image boundary is ignored.
(In this paper, we always regard the portion of the image near the boundary as unimportant.)
In the following, some basic results from projective geometry are summarized in a
way  that  is  convenient  in  our  consideration  of  the  image  plane  transformation. 

Suppose the image is characterized by a finite number of parameters $J_i$,
i = 1, 2, ..., N, which we call features of the image (Amari [1, 2]). (They are called
properties in Rosenfeld and Kak [9].) If the image is transformed by eqns (2.2) as a result of
camera rotation R, these features take different values $J'_i$, i = 1, ..., N. We say a set
of features $J_i$, i = 1, ..., N, is invariant if the values of $J'_i$, i = 1, ..., N, are determined
by the values of $J_i$, i = 1, ..., N, and the amount of camera rotation R alone.
This definition suggests that an invariant set of features is describing some aspects of
the  image  that  are  “inherent  to  the  scene  itself”  and  are  independent  of  the  camera 
orientation  (Weyl  [15]). 

Let $J_i$, i = 1, ..., N, be an invariant set of features. We say the set is reducible if
it  splits,  after  an  appropriate  rearrangement,  into  two  or  more  sets  of  features,  each  of 
which  is  itself  invariant  separately.  If  no  further  reduction  is  possible,  we  say  the  set  of 
features  is  irreducible.  This  definition  suggests  that  an  irreducible  invariant  set  of 
features  is  describing  a  “single”  characteristic  inherent  to  the  scene  while  a  reducible  set 
describes  two  or  more  different  characteristics  at  the  same  time  (Weyl  [15]). 

If  a  quantity  c  does  not  change  its  value  under  transformation  (2.2),  i.e. , 

c'=c,  (2.4) 


under  camera  rotation  R,  we  call  it  a  scalar.  Obviously,  a  scalar  is  itself  an  invariant 
and  is  irreducible.  Hence,  it  describes  a  characteristic  inherent  to  the  scene. 


If  a  pair  a ,  b  of  numbers  is  transformed  as  x,  y  of  transformation  (2.2),  i.e., 


$$a' = f\,\frac{r_{11}a + r_{21}b + r_{31}f}{r_{13}a + r_{23}b + r_{33}f}, \qquad b' = f\,\frac{r_{12}a + r_{22}b + r_{32}f}{r_{13}a + r_{23}b + r_{33}f}, \tag{2.5}$$


we  call  it  a  point.  Note  that  any  pair  of  numbers  can  be  interpreted  as  a  position  on  the 
image  plane.  However,  it  is  interpreted  as  indicating  a  position  in  the  scene  if  and  only 
if it is transformed as a point. A point is also an invariant set of features and is irreducible.


A  line  on  the  image  plane  is  expressed  in  the  form 

$$Ax + By + C = 0. \tag{2.6}$$

Here, the ratio A:B:C alone has a geometrical meaning; A, B, C and cA, cB, cC for a
non-zero scalar c define one and the same line. In order to emphasize this fact, let us
write A:B:C to express a line. If transformation (2.2) is applied, line (2.6) is mapped
into a line

$$A'x' + B'y' + C' = 0, \tag{2.7}$$

as in the following theorem.


Theorem 2. A line A:B:C on the image plane is transformed by camera rotation R
into the line

$$A':B':C' = r_{11}A + r_{21}B + r_{31}C/f \;:\; r_{12}A + r_{22}B + r_{32}C/f \;:\; f(r_{13}A + r_{23}B) + r_{33}C. \tag{2.8}$$

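Proof. In view of eqns (2.1), eqn (2.6) is written as A(fX/Z) + B(fY/Z) + C = 0, or

$$\begin{pmatrix} A & B & C/f \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \end{pmatrix} = 0. \tag{2.9}$$

From eqn (2.3), we find that A, B, C/f are transformed as a vector, i.e.,

$$\begin{pmatrix} A' \\ B' \\ C'/f \end{pmatrix} = R^T \begin{pmatrix} A \\ B \\ C/f \end{pmatrix}, \tag{2.10}$$

from which eqn (2.8) is obtained.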
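Theorem 2 can be checked numerically: a point lying on a line should still lie on the transformed line after both are mapped by the same camera rotation. The sketch below assumes hypothetical helpers `transform_point` (eqn (2.2)) and `transform_line` (eqn (2.8)) and arbitrary sample values.

```python
import math

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def transform_point(R, x, y, f):
    """Eqn (2.2): image point under camera rotation R."""
    Xp = R[0][0] * x + R[1][0] * y + R[2][0] * f
    Yp = R[0][1] * x + R[1][1] * y + R[2][1] * f
    Zp = R[0][2] * x + R[1][2] * y + R[2][2] * f
    return f * Xp / Zp, f * Yp / Zp

def transform_line(R, A, B, C, f):
    """Eqn (2.8): (A', B', C'/f) = R^T (A, B, C/f)."""
    n = (A, B, C / f)
    Ap = sum(R[k][0] * n[k] for k in range(3))
    Bp = sum(R[k][1] * n[k] for k in range(3))
    Cp = f * sum(R[k][2] * n[k] for k in range(3))
    return Ap, Bp, Cp

f = 2.0
R = rot_y(0.25)                 # arbitrary test rotation
A, B, C = 1.0, 2.0, -0.5        # line x + 2y - 0.5 = 0
x, y = 0.1, 0.2                 # a point on that line
xp, yp = transform_point(R, x, y, f)
Ap, Bp, Cp = transform_line(R, A, B, C, f)
residual = Ap * xp + Bp * yp + Cp   # vanishes if incidence is preserved
print(residual)
```

Only the ratio A':B':C' matters, so any non-zero multiple of the returned triple describes the same line.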


If the ratio of three given quantities A:B:C is transformed by eqn (2.8) under
camera rotation, we call it a line and write it as A:B:C. It is an invariant set of
features and is evidently irreducible. As in the case of a point, any triplet of numbers
can be interpreted as a line on the image plane, but it is interpreted as a line in the
scene if and only if it is transformed as a line.

All the invariant properties considered in this paper are invariant with respect to
the “projective transformations” of the form of eqns (2.2). In traditional “projective
geometry”, all equations are written in terms of “homogeneous coordinates” defined in a
“projective space” (cf. Naeve and Eklundh [8]). If we regard the xy-image plane (with
the “line at infinity” added) as a two-dimensional projective space and introduce homogeneous
coordinates, eqns (2.2) are rewritten as a linear transformation. The “point” and
“line” defined here are mutually “dual” and expressed exactly dually in homogeneous
coordinates.

However, the purpose of this paper is to deal with applications of the ideas of projective
geometry, and in dealing with real images the xy-Cartesian coordinate system is
most  convenient.  Therefore,  in  the  following,  we  express  all  the  invariant  properties  in 
terms  of  the  xy-“inhomogeneous”  coordinates  of  the  image  plane.  The  aim  of  this  paper 
is  to  translate  the  results  known  in  projective  geometry  into  “manageable”  forms  and  to 
demonstrate  the  practical  use  of  this  type  of  knowledge. 

3.  IRREDUCIBLE  REDUCTION  OF  3D  VECTORS  AND  TENSORS 

Consider three quantities a, b, c which are transformed as a 3D vector, i.e.,

$$\begin{pmatrix} a' \\ b' \\ c' \end{pmatrix} = R^T \begin{pmatrix} a \\ b \\ c \end{pmatrix} \tag{3.1}$$

for camera rotation R. (Note that the rotation matrix R is transposed because we
adopted the convention that R is the amount of “camera rotation”.) This is an invariant
set of features but is not irreducible because of the following fact.

Lemma 1. If a, b, c are transformed as a 3D vector, then the length $\sqrt{a^2+b^2+c^2}$ is a
scalar.

There are two ways, mutually dual, to interpret a 3D vector a, b, c as irreducible
sets of features. One way is to regard fa/c, fb/c as a point and the length $\sqrt{a^2+b^2+c^2}$
as its intensity, which is a scalar. We can easily check from Theorem 1 that

Lemma 2. If a, b, c are transformed as a 3D vector, then fa/c, fb/c are transformed
as a point.

Hence a pair fa/c, fb/c has an interpretation as a point invariant on the image plane in
the sense described above. Here, we allow the case c = 0, regarding it as a point located
at infinity. We also make the convention that the intensity is negative if c < 0. If we
imagine that the 3D vector (a,b,c) is emanating from the origin O (or the camera focus)
of the XYZ-coordinate system, the point (fa/c, fb/c) is the intersection of the image
plane with the ray defined by the 3D vector (a,b,c).

Another way to represent a 3D vector on the image plane is to regard a:b:fc as a
line and the length $\sqrt{a^2+b^2+c^2}$ as its intensity. We can easily check from Theorem 2
that

Lemma 3. If a, b, c are transformed as a 3D vector, then a:b:fc is transformed as a
line.

Hence, equation ax + by + fc = 0 has an interpretation as a line invariant on the image
plane in the sense described above. If we imagine that the 3D vector (a,b,c) is emanating
from the origin O (or the camera focus) of the XYZ-coordinate system, the line
ax + by + fc = 0 is the intersection of the image plane with the plane passing through the origin
O and perpendicular to (a,b,c). As before, we allow the case of a = b = 0, regarding the
line as located at infinity, and make the convention that the intensity is negative if c < 0.

The above results are summarized as follows:


Theorem 3. A 3D vector is an invariant feature set. It can be irreducibly reduced into
a point and a scalar or into a line and a scalar on the image plane.


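Next, consider nine elements $A_{ij}$, i,j = 1,2,3, which are transformed by camera rotation
R as a 3D tensor, i.e.,

$$\begin{pmatrix} A'_{11} & A'_{12} & A'_{13} \\ A'_{21} & A'_{22} & A'_{23} \\ A'_{31} & A'_{32} & A'_{33} \end{pmatrix} = R^T \begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix} R. \tag{3.2}$$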

By  definition,  this  is  an  invariant  set  of  features.  However,  it  is  reducible.  First,  it  can 
be  decomposed  into  a  symmetric  part  and  an  antisymmetric  part  (or  skew  part): 


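$$\begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix} = \begin{pmatrix} A_{11} & (A_{12}+A_{21})/2 & (A_{31}+A_{13})/2 \\ (A_{12}+A_{21})/2 & A_{22} & (A_{23}+A_{32})/2 \\ (A_{31}+A_{13})/2 & (A_{23}+A_{32})/2 & A_{33} \end{pmatrix} + \begin{pmatrix} 0 & (A_{12}-A_{21})/2 & -(A_{31}-A_{13})/2 \\ -(A_{12}-A_{21})/2 & 0 & (A_{23}-A_{32})/2 \\ (A_{31}-A_{13})/2 & -(A_{23}-A_{32})/2 & 0 \end{pmatrix} \tag{3.3}$$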


and each part is transformed as a 3D tensor by eqn (3.2) separately. Moreover, it can be
verified that the three independent elements $(A_{23}-A_{32})/2$, $(A_{31}-A_{13})/2$, $(A_{12}-A_{21})/2$ of
the antisymmetric part are transformed as a 3D vector. Hence, they are, from Theorem
3, irreducibly reduced into a point and a scalar or into a line and a scalar.

Suppose $A = (A_{ij})$ is already a symmetric 3D tensor. As is well known, such a tensor
is represented by three mutually perpendicular unit vectors $e_1$, $e_2$, $e_3$ indicating the
principal axes and the corresponding principal values $\sigma_1$, $\sigma_2$, $\sigma_3$ in the form

$$A = \sigma_1 e_1 e_1^T + \sigma_2 e_2 e_2^T + \sigma_3 e_3 e_3^T. \tag{3.4}$$

Here, this representation does not change if $e_1$ (or $e_2$ or $e_3$) is replaced by $-e_1$ (or $-e_2$ or
$-e_3$). (If two of $\sigma_1$, $\sigma_2$, $\sigma_3$ are identical, the corresponding principal axes are not unique
and can be arbitrarily rotated rigidly around the remaining one. If all of $\sigma_1$, $\sigma_2$, $\sigma_3$ are
identical, the orientations of $e_1$, $e_2$, $e_3$ are completely arbitrary as long as they are
mutually orthogonal.)

The three principal values are scalars, each of which is an invariant irreducible
feature. On the other hand, if we determine the orientations of two of the three principal
axes, say $e_1$ and $e_2$, the orientation of the remaining one is uniquely
determined. ($e_3$ and $-e_3$ indicate the same orientation.) As is shown in Theorem 3, the
orientations of $e_1$ and $e_2$ are represented by two points on the image plane. (If we
replace $e_1$ (or $e_2$) by $-e_1$ (or $-e_2$), the corresponding points are unchanged as desired.)
However, since $e_1$ and $e_2$ are perpendicular, one of the two points and the line connecting
the two points are sufficient; if one point on the image plane and a line through it
are given, the three orientations are determined (Appendix A). Thus, we obtain

Theorem 4. A 3D tensor is invariantly reduced to its symmetric part and its antisymmetric
part. The antisymmetric part is irreducibly reduced into a point and a scalar or
a  line  and  a  scalar.  The  symmetric  part  is  irreducibly  reduced  to  three  scalars,  a  point 
and  a  line  through  it. 
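The first step of this reduction can be sketched numerically. The code below splits a tensor into its symmetric and antisymmetric parts and checks that the three independent elements of the antisymmetric part transform as a 3D vector when the tensor is transformed as $R^T A R$. All helper names and the sample tensor are assumptions for illustration only.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def decompose(A):
    """Return (symmetric part, antisymmetric part, associated 3D vector)."""
    sym = [[(A[i][j] + A[j][i]) / 2 for j in range(3)] for i in range(3)]
    skew = [[(A[i][j] - A[j][i]) / 2 for j in range(3)] for i in range(3)]
    # ((A23 - A32)/2, (A31 - A13)/2, (A12 - A21)/2)
    vec = [skew[1][2], skew[2][0], skew[0][1]]
    return sym, skew, vec

def transform_tensor(R, A):
    """Tensor rule A' = R^T A R."""
    return matmul(transpose(R), matmul(A, R))

def transform_vector(R, v):
    """Vector rule v' = R^T v."""
    return [sum(R[k][i] * v[k] for k in range(3)) for i in range(3)]

A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]   # arbitrary tensor
R = matmul(rot_z(0.7), rot_x(-0.4))                        # arbitrary rotation

sym, skew, v = decompose(A)
_, _, v_after = decompose(transform_tensor(R, A))
v_expected = transform_vector(R, v)
print(v_after, v_expected)
```

The vector extracted after transforming the tensor agrees with the transformed vector, which is what allows Theorem 3 to be applied to the antisymmetric part.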

4.  INFINITESIMAL  GENERATORS  OF  THE  IMAGE  TRANSFORMATION 

Let F(x,y) represent an observed image. This may be the intensity of the gray-level
or a vector-valued function corresponding to R, B and G. Here, the value of F(x,y)
is assumed to be inherent to the scene and independent of the viewing orientation.
Color, for example, has this property. Furthermore, F(x,y) is assumed to be of finite
support, i.e., F(x,y) is zero at a sufficiently large distance from the origin of the image
plane.

Let us write the transformation of eqn (2.2), which is determined by the rotation
matrix R, symbolically as

$$(x',y') = M[R](x,y). \tag{4.1}$$

Then, we can see the (transposed) homomorphism in the sense that

$$M[R_2] \circ M[R_1] = M[R_1 R_2]. \tag{4.2}$$

Now, define the rotation operator $T_R$ acting on image F(x,y) by

$$T_R F(x,y) = F(M[R^T](x,y)). \tag{4.3}$$

In view of our assumptions of image value constancy and finite support, the function
$T_R F(x,y)$ describes the image we observe if the image plane undergoes the transformation
(2.2). Operator $T_R$ induces a representation of the 3D rotation group SO(3) in the
sense that

$$T_{R_1 R_2} = T_{R_2} \circ T_{R_1}. \tag{4.4}$$

As is well known, this representation is completely determined once its behavior for
infinitesimal rotations (i.e., its Lie algebra) is known, since SO(3) is a compact Lie group.

A 3D rotation is specified by the rotation axis $(n_1, n_2, n_3)$, which is taken to be a
unit vector, and the rotation angle $\Omega$ (rad) screwwise around it. As is well known, the
corresponding rotation matrix is given by

$$R = \begin{pmatrix} \cos\Omega + (1-\cos\Omega)n_1^2 & (1-\cos\Omega)n_1 n_2 - \sin\Omega\, n_3 & (1-\cos\Omega)n_1 n_3 + \sin\Omega\, n_2 \\ (1-\cos\Omega)n_2 n_1 + \sin\Omega\, n_3 & \cos\Omega + (1-\cos\Omega)n_2^2 & (1-\cos\Omega)n_2 n_3 - \sin\Omega\, n_1 \\ (1-\cos\Omega)n_3 n_1 - \sin\Omega\, n_2 & (1-\cos\Omega)n_3 n_2 + \sin\Omega\, n_1 & \cos\Omega + (1-\cos\Omega)n_3^2 \end{pmatrix}. \tag{4.5}$$


If the rotation is infinitesimally small, i.e., $\Omega$ is infinitesimally small, the rotation matrix
takes the form $R = I + \delta R + o(\Omega)$, where I is the unit matrix, $\delta R$ is the matrix given by

$$\delta R = \begin{pmatrix} 0 & -\Omega_3 & \Omega_2 \\ \Omega_3 & 0 & -\Omega_1 \\ -\Omega_2 & \Omega_1 & 0 \end{pmatrix}, \tag{4.6}$$

and $o(\Omega)$ denotes higher order terms of $\Omega$. (We let the context indicate whether these
terms are scalars, vectors or tensors.) Here, we put $\Omega_1 = \Omega n_1$, $\Omega_2 = \Omega n_2$ and $\Omega_3 = \Omega n_3$.
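The axis-angle form (4.5) and its first-order approximation (4.6) can be compared numerically. This is a minimal sketch; the function names `rotation_matrix` and `first_order` and the chosen axis and angles are assumptions for illustration.

```python
import math

def rotation_matrix(n, omega):
    """Eqn (4.5): rotation by angle omega (rad) around the unit axis n."""
    n1, n2, n3 = n
    c, s = math.cos(omega), math.sin(omega)
    k = 1.0 - c
    return [
        [c + k * n1 * n1, k * n1 * n2 - s * n3, k * n1 * n3 + s * n2],
        [k * n2 * n1 + s * n3, c + k * n2 * n2, k * n2 * n3 - s * n1],
        [k * n3 * n1 - s * n2, k * n3 * n2 + s * n1, c + k * n3 * n3],
    ]

def first_order(n, omega):
    """I + delta R of eqn (4.6), with Omega_i = omega * n_i."""
    o1, o2, o3 = (omega * ni for ni in n)
    return [[1.0, -o3, o2], [o3, 1.0, -o1], [-o2, o1, 1.0]]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

axis = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)      # unit vector
R = rotation_matrix(axis, 0.8)
# Orthogonality check: R R^T should be the unit matrix.
RRt = [[sum(R[i][k] * R[j][k] for k in range(3)) for j in range(3)]
       for i in range(3)]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
orth_error = max_abs_diff(RRt, identity)

small = 1e-4
approx_error = max_abs_diff(rotation_matrix(axis, small), first_order(axis, small))
print(orth_error, approx_error)
```

For a small angle the discrepancy between the two matrices is of second order in the angle, as the $o(\Omega)$ term indicates.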

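If the rotation is infinitesimal, the transformation of eqns (2.2) becomes
$x' = x + \delta x + o(\Omega)$ and $y' = y + \delta y + o(\Omega)$, where

$$\delta x = -f\Omega_2 + \Omega_3 y + \frac{1}{f}(-\Omega_2 x + \Omega_1 y)x, \qquad \delta y = f\Omega_1 - \Omega_3 x + \frac{1}{f}(-\Omega_2 x + \Omega_1 y)y. \tag{4.7}$$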
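As a quick numerical check of eqns (4.7), the sketch below substitutes the first-order matrix $I + \delta R$ into eqn (2.2) and compares the result with $x + \delta x$, $y + \delta y$. The function names and the sample values are assumptions for this example.

```python
def transform_point(R, x, y, f):
    """Eqn (2.2): image point under camera rotation R."""
    Xp = R[0][0] * x + R[1][0] * y + R[2][0] * f
    Yp = R[0][1] * x + R[1][1] * y + R[2][1] * f
    Zp = R[0][2] * x + R[1][2] * y + R[2][2] * f
    return f * Xp / Zp, f * Yp / Zp

def displacement(omega, x, y, f):
    """Eqns (4.7): first-order image displacement (delta x, delta y)."""
    o1, o2, o3 = omega
    common = (-o2 * x + o1 * y) / f
    dx = -f * o2 + o3 * y + common * x
    dy = f * o1 - o3 * x + common * y
    return dx, dy

omega = (2e-4, -1e-4, 3e-4)          # small rotation vector (Omega_1, Omega_2, Omega_3)
o1, o2, o3 = omega
R_small = [[1.0, -o3, o2], [o3, 1.0, -o1], [-o2, o1, 1.0]]   # I + delta R, eqn (4.6)

f, x, y = 1.5, 0.5, -0.3
xp, yp = transform_point(R_small, x, y, f)
dx, dy = displacement(omega, x, y, f)
err_x = abs(xp - (x + dx))
err_y = abs(yp - (y + dy))
print(err_x, err_y)
```

The residuals are of second order in the rotation, far smaller than the displacement itself.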


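Then, the image F(x,y) also undergoes an infinitesimal change and becomes

$$F(x - \delta x, y - \delta y) = F(x,y) + \delta F(x,y) + o(\Omega), \tag{4.8}$$

and $\delta F(x,y)$ is given by

$$\begin{aligned} \delta F(x,y) &= -\frac{\partial F}{\partial x}\delta x - \frac{\partial F}{\partial y}\delta y \\ &= -\Bigl[-f\Omega_2 + \Omega_3 y + \frac{1}{f}(-\Omega_2 x + \Omega_1 y)x\Bigr]\frac{\partial F}{\partial x} - \Bigl[f\Omega_1 - \Omega_3 x + \frac{1}{f}(-\Omega_2 x + \Omega_1 y)y\Bigr]\frac{\partial F}{\partial y} \\ &= -(\Omega_1 D_1 + \Omega_2 D_2 + \Omega_3 D_3)F(x,y), \end{aligned} \tag{4.9}$$

where the infinitesimal generators are defined by

$$D_1 = \frac{xy}{f}\frac{\partial}{\partial x} + \Bigl(f + \frac{y^2}{f}\Bigr)\frac{\partial}{\partial y}, \qquad D_2 = -\Bigl(f + \frac{x^2}{f}\Bigr)\frac{\partial}{\partial x} - \frac{xy}{f}\frac{\partial}{\partial y}, \qquad D_3 = y\frac{\partial}{\partial x} - x\frac{\partial}{\partial y}. \tag{4.10}$$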


Hence, operator $T_R$ becomes, for infinitesimal rotations,

$$T_R = I - (\Omega_1 D_1 + \Omega_2 D_2 + \Omega_3 D_3) + o(\Omega), \tag{4.11}$$

where I is the identity operator.

It can be checked easily that these infinitesimal generators satisfy the commutator
relations

$$[D_1, D_2] = D_3, \qquad [D_2, D_3] = D_1, \qquad [D_3, D_1] = D_2, \tag{4.12}$$

where the commutator is defined by [A,B] = AB - BA. Hence, a set of functions can be
found which induces a representation of the 3D rotation group SO(3) [3, 4].
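The first of the commutator relations can be verified by hand. The derivation below is a worked check, assuming the generators in the form $D_1 = \frac{xy}{f}\partial_x + (f + \frac{y^2}{f})\partial_y$, $D_2 = -(f + \frac{x^2}{f})\partial_x - \frac{xy}{f}\partial_y$, $D_3 = y\partial_x - x\partial_y$. For first-order operators $V = v^x\partial_x + v^y\partial_y$ and $W = w^x\partial_x + w^y\partial_y$, the second-derivative terms cancel in $VW - WV$, leaving $[V,W] = (Vw^x - Wv^x)\partial_x + (Vw^y - Wv^y)\partial_y$.

```latex
\begin{aligned}
D_1\Bigl(-f - \tfrac{x^2}{f}\Bigr) - D_2\Bigl(\tfrac{xy}{f}\Bigr)
  &= -\frac{2x^2 y}{f^2} + \Bigl(f + \frac{x^2}{f}\Bigr)\frac{y}{f} + \frac{x^2 y}{f^2} = y, \\
D_1\Bigl(-\tfrac{xy}{f}\Bigr) - D_2\Bigl(f + \tfrac{y^2}{f}\Bigr)
  &= -\frac{x y^2}{f^2} - \Bigl(f + \frac{y^2}{f}\Bigr)\frac{x}{f} + \frac{2 x y^2}{f^2} = -x, \\
[D_1, D_2] &= y\,\frac{\partial}{\partial x} - x\,\frac{\partial}{\partial y} = D_3 .
\end{aligned}
```

The other two relations follow by the same computation with the indices permuted cyclically.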

As is well known, a set of functions which induces an irreducible representation is
obtained as eigenfunctions of the Casimir operator

$$H = -(D_1^2 + D_2^2 + D_3^2). \tag{4.13}$$

The eigenvalue is l(l+1) and the eigenspace is 2l+1 dimensional, where l is an integer
or half-integer called the weight of the irreducible representation (cf. Gel’fand, et al. [3],
Hammermesh [4]). In other words, the differential equation

$$HF = l(l+1)F, \quad \text{or} \quad (D_1^2 + D_2^2 + D_3^2)F + l(l+1)F = 0, \tag{4.14}$$

has 2l+1 independent solutions, which become the basis of the irreducible representation
$D_l$ of weight l (Appendix B).


5.  ADJOINT  ROTATION  AND  FEATURE  TRANSFORMATION 

Let J be a feature of the image. To be precise, a feature is a functional mapping
the image function F(x,y) into a real number J[F(x,y)]. Consider a linear feature
obtained by weighted averaging or filtering:

$$J[F(x,y)] = \int m(x,y)F(x,y)\,dx\,dy. \tag{5.1}$$


Here, m(x,y) is the filter weight function and integration is performed over the entire
image plane. (Recall our assumption of finite support of F(x,y).) If the camera is
rotated by R, the image becomes $T_R F(x,y)$ by eqn (4.3) and hence the corresponding
feature becomes

$$J[T_R F(x,y)] = \int m(x,y)\,T_R F(x,y)\,dx\,dy. \tag{5.2}$$

We define the adjoint rotation operator $T_R^*$ by

$$J[T_R F(x,y)] = \int T_R^* m(x,y)\,F(x,y)\,dx\,dy. \tag{5.3}$$

From this definition, we can see that operator $T_R^*$ induces an adjoint representation of
the 3D rotation group in the sense that

$$T_{R_1 R_2}^* = T_{R_1}^* \circ T_{R_2}^*. \tag{5.4}$$

Once we know how this adjoint rotation operator $T_R^*$ acts, the transformation of such
features is immediately computed for any given image. This is done by just considering
infinitesimal transformations.


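If the image is infinitesimally changed as in eqn (4.8), feature J also undergoes an
infinitesimally small change $J \to J + \delta J + o(\Omega)$. Substitution of eqn (4.9) and integration
by parts yield

$$\delta J = \int (\Omega_1 D_1^* + \Omega_2 D_2^* + \Omega_3 D_3^*)\,m(x,y)\,F(x,y)\,dx\,dy, \tag{5.5}$$

where $D_1^*$, $D_2^*$ and $D_3^*$ are the adjoint infinitesimal generators defined by

$$D_1^* = \frac{3y}{f} + \frac{xy}{f}\frac{\partial}{\partial x} + \Bigl(f + \frac{y^2}{f}\Bigr)\frac{\partial}{\partial y}, \qquad D_2^* = -\frac{3x}{f} - \Bigl(f + \frac{x^2}{f}\Bigr)\frac{\partial}{\partial x} - \frac{xy}{f}\frac{\partial}{\partial y}, \qquad D_3^* = y\frac{\partial}{\partial x} - x\frac{\partial}{\partial y}. \tag{5.6}$$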


In eqn (5.5), no boundary terms appear due to our assumption of finite support for
F(x,y). Hence, operator $T_R^*$ becomes, for infinitesimal rotations,

$$T_R^* = I + \Omega_1 D_1^* + \Omega_2 D_2^* + \Omega_3 D_3^* + o(\Omega). \tag{5.7}$$
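The integration by parts can be spelled out as a worked step. The sketch below assumes each generator is written as a first-order operator $D_i = v_i^x\,\partial_x + v_i^y\,\partial_y$ (the coefficient functions $v_i^x$, $v_i^y$ are notation introduced here only) and uses the finite support of $F$ to drop boundary terms.

```latex
\begin{aligned}
-\int m\,(D_i F)\,dx\,dy
  &= \int F\,\Bigl[\frac{\partial (v_i^x m)}{\partial x} + \frac{\partial (v_i^y m)}{\partial y}\Bigr]dx\,dy
   = \int \bigl(D_i^* m\bigr)\,F\,dx\,dy, \\
D_i^* &= D_i + \Bigl(\frac{\partial v_i^x}{\partial x} + \frac{\partial v_i^y}{\partial y}\Bigr), \\
\frac{\partial}{\partial x}\Bigl(\frac{xy}{f}\Bigr) + \frac{\partial}{\partial y}\Bigl(f + \frac{y^2}{f}\Bigr) &= \frac{3y}{f}, \qquad
-\frac{\partial}{\partial x}\Bigl(f + \frac{x^2}{f}\Bigr) - \frac{\partial}{\partial y}\Bigl(\frac{xy}{f}\Bigr) = -\frac{3x}{f}, \qquad
\frac{\partial y}{\partial x} - \frac{\partial x}{\partial y} = 0 .
\end{aligned}
```

Under this reading, each adjoint generator is the original generator plus the divergence of its coefficient field, which accounts for the extra terms $3y/f$ and $-3x/f$ and for $D_3^* = D_3$.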

It can be checked easily that these adjoint infinitesimal generators satisfy the commutator
relations

$$[D_1^*, D_2^*] = D_3^*, \qquad [D_2^*, D_3^*] = D_1^*, \qquad [D_3^*, D_1^*] = D_2^*. \tag{5.8}$$

Hence, we can find a set of functions which induces a representation of the 3D rotation
group SO(3). Then, operator $T_R^*$ acts as a linear transformation on them (cf. Gel’fand,
et al. [3]). As before, a basis of the irreducible representation $D_l$ of weight l is obtained
as 2l+1 eigenfunctions of the (adjoint) Casimir operator

$$H^* = -(D_1^{*2} + D_2^{*2} + D_3^{*2}), \tag{5.9}$$

i.e., as 2l+1 independent solutions of the differential equation

$$H^* m = l(l+1)m, \quad \text{or} \quad (D_1^{*2} + D_2^{*2} + D_3^{*2})m + l(l+1)m = 0. \tag{5.10}$$


From Appendix C, we find that

$$C = f^3 \int \frac{F(x,y)}{\sqrt{(x^2+y^2+f^2)^3}}\,dx\,dy \tag{5.11}$$

is an invariant (i.e., it is transformed as a scalar). This implies that

$$\rho(x,y) = \frac{f^3}{\sqrt{(x^2+y^2+f^2)^3}} \tag{5.12}$$

is an invariant measure (Appendix D).


We also see from Appendix C that

$$\int \frac{p_i\,p_j\,F(x,y)}{\sqrt{(x^2+y^2+f^2)^5}}\,dx\,dy, \qquad i,j = 1,2,3, \quad (p_1, p_2, p_3) = (x, y, f),$$

are  transformed  as  a  3D  (symmetric)  tensor.  Hence,  they  are  irreducibly  reduced  to 
three  scalars,  a  point  and  a  line  through  it  on  the  image  plane.  They  are  invariant  and 
describe  characteristics  inherent  to  the  scene. 


6.  INVARIANT  CHARACTERIZATION  OF  A  SHAPE 

As an application of the results in the previous sections, let us consider the characterization
of a shape on the image plane. Consider a region S on the image plane. Its
characteristic function

$$F(x,y) = \begin{cases} 1 & \text{if } (x,y) \in S \\ 0 & \text{otherwise} \end{cases} \tag{6.1}$$

is taken as the image function F(x,y).


The simplest characteristic of the region S may be its area

$$S = \int_S dx\,dy \;\Bigl(= \int F(x,y)\,dx\,dy\Bigr). \tag{6.2}$$

However, this area is not invariant with respect to camera rotation. Suppose the region
S is located far away from the image origin. If we move it so that it comes to the center
of the image plane by appropriately controlling the camera orientation, the area of eqn
(6.2) changes. Consequently, eqn (6.2) is not considered to be a characteristic inherent
to the scene itself. In short, eqn (6.2) is not a scalar.

On the other hand, if eqn (6.2) is replaced by

$$C = f^3 \int_S \frac{dx\,dy}{\sqrt{(x^2+y^2+f^2)^3}}, \tag{6.3}$$

this is a scalar as was shown in the previous section. If S is a small region located
around the image origin, i.e., x ≈ 0, y ≈ 0 in S, then C is approximately equal to its area.
We call C the invariant area of region S. It is interpreted as the area the region would
have if the region were moved to the center of the image plane by changing the camera
orientation. Geometrically speaking, this quantity is nothing but an expression of the
solid angle the object makes with respect to the viewer.
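The contrast between the ordinary area and the invariant area can be illustrated numerically. The sketch below takes a disk around the image origin, maps it by a camera rotation through eqn (2.2), and estimates both the plain area and the weighted integral $f^3\int_S dx\,dy/\sqrt{(x^2+y^2+f^2)^3}$ on a midpoint grid. The disk, the rotation, the grid size and all function names are assumptions chosen only for this example.

```python
import math

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def transform_point(R, x, y, f):
    """Eqn (2.2): image point under camera rotation R."""
    Xp = R[0][0] * x + R[1][0] * y + R[2][0] * f
    Yp = R[0][1] * x + R[1][1] * y + R[2][1] * f
    Zp = R[0][2] * x + R[1][2] * y + R[2][2] * f
    return f * Xp / Zp, f * Yp / Zp

def area_and_invariant_area(inside, box, f, n=300):
    """Midpoint-rule estimates of the plain and the weighted area over `box`."""
    x0, x1, y0, y1 = box
    hx, hy = (x1 - x0) / n, (y1 - y0) / n
    area = inv = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * hx
        for j in range(n):
            y = y0 + (j + 0.5) * hy
            if inside(x, y):
                area += 1.0
                inv += f ** 3 / (x * x + y * y + f * f) ** 1.5
    return area * hx * hy, inv * hx * hy

f, radius, theta = 1.0, 0.2, 0.6
R = rot_y(theta)
Rt = transpose(R)

def in_S(x, y):
    return x * x + y * y <= radius * radius

def in_S_rotated(x, y):
    # A point belongs to the transformed region iff its pre-image lies in S.
    return in_S(*transform_point(Rt, x, y, f))

# Bounding box of the transformed region, from mapped boundary samples.
boundary = [transform_point(R, radius * math.cos(2 * math.pi * k / 360),
                            radius * math.sin(2 * math.pi * k / 360), f)
            for k in range(360)]
xs = [p[0] for p in boundary]
ys = [p[1] for p in boundary]
pad = 0.02
box_rotated = (min(xs) - pad, max(xs) + pad, min(ys) - pad, max(ys) + pad)

S1, C1 = area_and_invariant_area(in_S, (-radius, radius, -radius, radius), f)
S2, C2 = area_and_invariant_area(in_S_rotated, box_rotated, f)
print(S1, S2, C1, C2)
```

In this setup the plain area grows noticeably as the region moves away from the image center, while the two weighted integrals agree up to the discretization error of the grid.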

Another simple but important characteristic is the center of gravity of the region
S:

$$\bar{x} = \int_S x\,dx\,dy \Big/ \int_S dx\,dy, \qquad \bar{y} = \int_S y\,dx\,dy \Big/ \int_S dx\,dy. \tag{6.4}$$

Again these quantities do not have invariant meanings. Namely, if region S is moved to
another region by camera rotation and $(\bar{x}', \bar{y}')$ is its center of gravity, $(\bar{x}, \bar{y})$ is not
mapped into $(\bar{x}', \bar{y}')$ by the same camera rotation. In short, $\bar{x}$, $\bar{y}$ is not a point.


On the other hand, we know from the previous section that

$$a_1 = \int_S \frac{x\,dx\,dy}{(x^2+y^2+f^2)^2}, \qquad a_2 = \int_S \frac{y\,dx\,dy}{(x^2+y^2+f^2)^2}, \qquad a_3 = \int_S \frac{f\,dx\,dy}{(x^2+y^2+f^2)^2} \tag{6.5}$$

are transformed as a 3D vector. Hence, $fa_1/a_3$, $fa_2/a_3$ are transformed as a point. If
the region S is a small region located around the image origin and x ≈ 0, y ≈ 0 in S, then
$(fa_1/a_3, fa_2/a_3)$ is approximately the center of gravity of the region. We call
$(fa_1/a_3, fa_2/a_3)$ the invariant center of gravity of region S. It is interpreted as the point
which would be mapped into the center of gravity if the region were moved to the center
of the image plane by changing the camera orientation. Geometrically, this point
corresponds to the center of the solid angle the object makes with respect to the viewer.

Another useful characteristic is the moment tensor $(M_{ij})$, i,j = 1,2, defined by

$$M_{11} = \int_S (x-\bar{x})^2\,dx\,dy, \qquad M_{12} = M_{21} = \int_S (x-\bar{x})(y-\bar{y})\,dx\,dy, \qquad M_{22} = \int_S (y-\bar{y})^2\,dx\,dy. \tag{6.6}$$

Its principal values indicate the amount of elongation of the region S along the
corresponding principal axes. However, as described above, this tensor does not have
invariant properties. Namely, the principal values of $(M_{ij})$ are not scalars, and its principal
axes are not lines on the image plane.

On  the  other  hand,  we  know  from  the  previous  section  that 



maximum principal value. Let $e_1$, $e_2$, $e_3$ be the corresponding unit eigenvectors (determined
except for sign). Let $(g_1, g_2)$ be the point corresponding to vector $e_3$. Let $l_1$ be
the line through $(g_1, g_2)$ and the point corresponding to vector $e_1$ (or the line representing
vector $e_2$). Similarly, let $l_2$ be the line through $(g_1, g_2)$ and the point corresponding
to vector $e_2$ (or the line representing vector $e_1$). By our method of construction, scalars
$\sigma_1$, $\sigma_2$, point $(g_1, g_2)$ and lines $l_1$, $l_2$ are all invariant quantities. It can be checked that
lines $l_1$, $l_2$ are approximately the principal axes, and $\sigma_1$, $\sigma_2$ are approximately the
corresponding principal values if S is a sufficiently small region around the origin.
Hence, scalars $\sigma_1$ and $\sigma_2$ are the principal values the region would have if it were moved
to the center of the image plane by camera rotation, and $l_1$, $l_2$ are lines which would be
mapped onto the principal axes. We call point $(g_1, g_2)$ the invariant center of inertia,
lines $l_1$, $l_2$ the invariant principal axes, and $\sigma_1$, $\sigma_2$ the corresponding invariant principal
values.


7.  INVARIANTS  AND  CAMERA  ROTATION  RECONSTRUCTION 

In the previous section, scalar C defined by eqn (6.3), 3D vector $a = (a_i)$ defined by
eqns (6.5) and 3D tensor $B = (B_{ij})$ defined by eqns (6.7) are interpreted as a set of two
dimensional invariant quantities on the image plane. Here, let us consider their three
dimensional aspects.

First, since C, a and B are transformed as a scalar, a vector and a tensor, respectively,
by camera rotation, we can extract invariants that do not change their values
when the camera is rotated. Obviously, scalar C itself is an invariant.

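Second, since a is a 3D vector, it has, as was discussed in Section 3, only one
invariant, namely its length ||a||, or equivalently $a^T a$.

On the other hand, B is a 3D symmetric tensor, and hence it has, as was described
in Section 3, three invariants, namely the three principal values $\sigma_1$, $\sigma_2$, $\sigma_3$, or
equivalently any three independent algebraic expressions formed from them such as the
fundamental symmetric forms $\sigma_1+\sigma_2+\sigma_3$, $\sigma_1\sigma_2+\sigma_2\sigma_3+\sigma_3\sigma_1$, $\sigma_1\sigma_2\sigma_3$. In terms of the
components of the original tensor B, they are respectively

$$B_{11}+B_{22}+B_{33}\ (= \operatorname{Tr}(B)), \qquad \begin{vmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{vmatrix} + \begin{vmatrix} B_{22} & B_{23} \\ B_{32} & B_{33} \end{vmatrix} + \begin{vmatrix} B_{33} & B_{31} \\ B_{13} & B_{11} \end{vmatrix}, \qquad \begin{vmatrix} B_{11} & B_{12} & B_{13} \\ B_{21} & B_{22} & B_{23} \\ B_{31} & B_{32} & B_{33} \end{vmatrix}\ (= \det B). \tag{7.1}$$

Alternatively, we can use $\sigma_1^2+\sigma_2^2+\sigma_3^2$ and $\sigma_1^3+\sigma_2^3+\sigma_3^3$ together with $\sigma_1+\sigma_2+\sigma_3$. This set is equal to

$$\operatorname{Tr}(B), \qquad \operatorname{Tr}(B^2), \qquad \operatorname{Tr}(B^3). \tag{7.2}$$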

Finally, there are invariants describing the relationship between 3D vector $a$ and
3D tensor $B$. As was discussed in Section 3, a 3D vector is geometrically thought of as a
directed axis to which its length is attached and a 3D symmetric tensor as three mutually
perpendicular (undirected) axes to which their respective principal values are
attached. Now that the length and the principal values have been counted, the remaining
invariants are those specifying the orientation of the vector relative to the three
mutually perpendicular axes. Hence, two invariants exist. We can choose, say, $a^T B a$
and $a^T B^2 a$ (Smith [10], Spencer [11], Wang [12-14]). Of course, the choice is not
unique as stated above, and other choices are also possible.

We say that two regions $S$ and $S'$ on the image plane are equivalent if one region
can be transformed into the other by camera rotation, i.e., by changing the camera
orientation. If the two regions are equivalent, the above invariants must have identical
values. If they have different values, the two regions cannot be equivalent. On the
other hand, if the two regions are known to be equivalent, the camera rotation which
would take one region into the other can be reconstructed by observing the invariant
center of gravity and the invariant moment tensor alone. This is done as follows.

Suppose we observe $a$ and $B$ for region $S$ and $a'$ and $B'$ for region $S'$. Assume
that $B$ (hence $B'$ as well) has three distinct eigenvalues and $a \neq 0$. Let $e_1$, $e_2$ and $e_3$ be
the associated eigenvectors of $B$. Since the eigenvectors are determined except for sign
and magnitude, choose one set such that $e_1$, $e_2$, $e_3$ are mutually perpendicular unit vectors
forming a right-hand system in that order. Construct a matrix $R_1$ having $e_1$, $e_2$, $e_3$
as its columns in that order. Let $e'_1$, $e'_2$, $e'_3$ be the corresponding unit eigenvectors of $B'$
forming a right-hand system. Since the signs of the eigenvectors are arbitrary, there are
four possibilities to make a right-hand system. For each case, construct the corresponding
matrix $R_2$. Then, the rotation matrix which transforms $B$ to $B'$ is given by

$$R = R_1 R_2^T. \tag{7.3}$$

(Matrix $B$ is first transformed by $R_1^{-1}$ $(=R_1^T)$ into a diagonal matrix, which in turn is
transformed to $B'$ by $R_2$.) Finally, choose the one out of those four possible $R$s that
transforms $a$ to $a'$.
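
The procedure for the case of three distinct eigenvalues can be sketched numerically as follows. This is only a minimal illustration in Python, not the computation used for the example in this paper. It assumes that the features transform as $a' = R^T a$ and $B' = R^T B R$, and it obtains the eigenvectors with a small cyclic Jacobi iteration, which is a choice made here for self-containedness. The helper names (`eigen_frame`, `reconstruct_rotation`) and all numerical data are hypothetical.

```python
import math

def transpose(M):
    return [list(r) for r in zip(*M)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(M, v):
    return [sum(M[i][k] * v[k] for k in range(3)) for i in range(3)]

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def identity():
    return [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

def eigen_frame(B, sweeps=30):
    """Unit eigenvectors of symmetric B as columns of a right-handed matrix,
    ordered by increasing eigenvalue (cyclic Jacobi iteration)."""
    A = [row[:] for row in B]
    V = identity()
    for _ in range(sweeps):
        for p, q in ((0, 1), (0, 2), (1, 2)):
            if abs(A[p][q]) < 1e-15:
                continue
            th = 0.5 * math.atan2(2.0 * A[p][q], A[q][q] - A[p][p])
            if th > math.pi / 4:        # keep the rotation angle small
                th -= math.pi / 2
            elif th < -math.pi / 4:
                th += math.pi / 2
            J = identity()
            J[p][p] = J[q][q] = math.cos(th)
            J[p][q], J[q][p] = math.sin(th), -math.sin(th)
            A = matmul(matmul(transpose(J), A), J)   # zeroes A[p][q]
            V = matmul(V, J)
    order = sorted(range(3), key=lambda i: A[i][i])
    E = [[V[r][c] for c in order] for r in range(3)]
    if det3(E) < 0:                     # flip e3 to obtain a right-hand system
        for r in range(3):
            E[r][2] = -E[r][2]
    return E

def reconstruct_rotation(a, B, a_new, B_new):
    """Among the four candidates R = R1 R2^T, return the one whose R^T a is closest to a'."""
    R1, R2 = eigen_frame(B), eigen_frame(B_new)
    best, best_err = None, float("inf")
    for s in ((1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)):
        R2s = [[R2[r][c] * s[c] for c in range(3)] for r in range(3)]
        R = matmul(R1, transpose(R2s))
        pred = matvec(transpose(R), a)
        err = sum((x - y) ** 2 for x, y in zip(pred, a_new))
        if err < best_err:
            best, best_err = R, err
    return best

# Hypothetical data: features of S and a known rotation to be recovered.
a = [0.3, -0.2, 0.9]
B = [[0.30, 0.05, -0.02], [0.05, 0.20, 0.04], [-0.02, 0.04, 0.60]]
t, p = math.radians(30), math.radians(40)
Rz = [[math.cos(t), -math.sin(t), 0.0], [math.sin(t), math.cos(t), 0.0], [0.0, 0.0, 1.0]]
Rx = [[1.0, 0.0, 0.0], [0.0, math.cos(p), -math.sin(p)], [0.0, math.sin(p), math.cos(p)]]
R_true = matmul(Rx, Rz)
a_new = matvec(transpose(R_true), a)
B_new = matmul(matmul(transpose(R_true), B), R_true)

R_est = reconstruct_rotation(a, B, a_new, B_new)
max_err = max(abs(R_est[i][j] - R_true[i][j]) for i in range(3) for j in range(3))
print(max_err)
```

The four sign patterns enumerated in `reconstruct_rotation` are exactly the sign changes of the eigenvectors that preserve a right-hand system. With noisy data the selection by $a$ may be unreliable when $a$ is nearly parallel to one eigenvector.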

If $B$ (hence $B'$ as well) has only two distinct eigenvalues (a single root and a pair of
multiple roots), let $e_1$ be the eigenvector associated with the single root. Suppose $a$ is
neither parallel nor perpendicular to $e_1$. Since the sign of $e_1$ is arbitrary, choose it so
that $a$ and $e_1$ make an acute angle. Then, we can construct three mutually orthogonal
vectors forming a right-hand system $e_1$, $e_2 = e_1 \times a / \|e_1 \times a\|$, $e_3 = e_1 \times e_2$. We can form
$R_1$ and $R_2$ as described above, and the desired rotation is given by eqn (7.3). If $a$ is
perpendicular to $e_1$, there exist two solutions. If $a$ is parallel to $e_1$, or if $B$ (hence $B'$ as
well) has one eigenvalue (i.e., $B$ $(=B')$ is a multiple of $I$), $R$ is any rotation that maps $a$
to $a'$ and we can add any rotation around $a'$. The case where $a = a' = 0$ is treated similarly.
These observations can be summarized as follows:

Theorem 5.

$$C, \quad a^T a, \quad \mathrm{Tr}(B), \quad \mathrm{Tr}(B^2), \quad \mathrm{Tr}(B^3), \quad a^T B a, \quad a^T B^2 a \tag{7.4}$$

exhaust all the invariants constructed from $C$, $a$ and $B$. If two regions are equivalent,
the amount of camera rotation which takes one region into the other can be reconstructed
from $a$ and $B$ alone.
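
The invariance of the seven quantities of (7.4) can be checked numerically. The following Python sketch is only an illustration: the values of $C$, $a$, $B$ and the rotation are hypothetical, and the transformation rule $a' = R^T a$, $B' = R^T B R$ is assumed.

```python
import math

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(M, v):
    return [sum(M[i][k] * v[k] for k in range(3)) for i in range(3)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def trace(M):
    return M[0][0] + M[1][1] + M[2][2]

def invariants(C, a, B):
    """The seven quantities of (7.4) for scalar C, vector a and symmetric tensor B."""
    B2 = matmul(B, B)
    B3 = matmul(B2, B)
    return [C, dot(a, a), trace(B), trace(B2), trace(B3),
            dot(a, matvec(B, a)), dot(a, matvec(B2, a))]

# Hypothetical feature values (not taken from the example of this paper).
C = 0.5
a = [0.1, -0.2, 0.9]
B = [[0.30, 0.05, -0.02],
     [0.05, 0.20, 0.04],
     [-0.02, 0.04, 0.60]]

# A sample rotation: 30 degrees about the z-axis, then 40 degrees about the x-axis.
t, p = math.radians(30), math.radians(40)
Rz = [[math.cos(t), -math.sin(t), 0.0], [math.sin(t), math.cos(t), 0.0], [0.0, 0.0, 1.0]]
Rx = [[1.0, 0.0, 0.0], [0.0, math.cos(p), -math.sin(p)], [0.0, math.sin(p), math.cos(p)]]
R = matmul(Rx, Rz)

# Assumed transformation rule: a' = R^T a, B' = R^T B R, and C is unchanged.
Rt = transpose(R)
a_new = matvec(Rt, a)
B_new = matmul(matmul(Rt, B), R)

before = invariants(C, a, B)
after = invariants(C, a_new, B_new)
max_diff = max(abs(x - y) for x, y in zip(before, after))
print(max_diff)
```

In an equivalence test, `invariants` would be evaluated on the features of the two regions and the two lists compared within a tolerance suited to the noise level.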

An important fact is that neither the equivalence test nor the camera rotation reconstruction
requires knowledge of point-to-point correspondence, since the computation
is solely based on the features (6.3), (6.5), (6.7), which are obtained by integration
over the regions under consideration.

Theoretically, the camera rotation is exactly reconstructed as described above. In
practice, however, the invariant center of gravity $(f a_1/a_3, f a_2/a_3)$ and the invariant
center of inertia are usually located very near each other, and vector $a$ and vector $e_3$ are
very close to each other. Therefore, the last step of choosing one out of four possible $R$s
by checking $Ra$ may become difficult if much noise is involved. In this case, the final
choice is made by applying the transformation (2.2) to region $S$ in four ways and choosing
the one which makes region $S$ sufficiently overlap $S'$. (Since we are focusing on
the principal axes, the four possibilities correspond to the four possible (skewed) "mirror
images" (including identity) with respect to the principal axes.)

Example. Consider the three regions $S_0$, $S_1$, $S_2$ on the image plane (Fig. 2(a)). We use
a scaling such that the focal length $f$ is unity. Computing the integrations of eqns (6.5)
and (6.7), we find their invariant centers of gravity (Fig. 2(b)) and principal axes (Fig.
2(c)) as follows:


                       S_0                    S_1                    S_2
center of gravity      (-0.081, -0.202)       (0.464, 0.076)         (-0.470, 0.346)
principal axes         y = -2.814x - 0.431    y = 1.667x - 0.697     y = -0.079x + 0.310
                       y = 0.382x - 0.171     y = -0.476x + 0.297    y = -16.522x - 7.424


The invariants of (7.4) become as follows:


               S_0       S_1       S_2
C              0.1440    0.1440    0.1121
a^T a          0.0202    0.0202    0.0123
Tr(B)          0.1440    0.1440    0.1121
Tr(B^2)        0.0197    0.0197    0.0121
Tr(B^3)        0.0028    0.0028    0.0013
a^T B a        0.0028    0.0028    0.0014
a^T B^2 a      0.0004    0.0004    0.0001


From this result, we can conclude that regions $S_0$ and $S_1$ are equivalent but region $S_2$ is
not equivalent to either. (Here, the data are exact up to rounding. If the data are
affected by a large amount of error, a statistical method such as hypothesis testing
becomes necessary.) By the procedure described in the previous section, the camera
rotation which maps region $S_0$ onto region $S_1$ is reconstructed to be

$$R = \begin{pmatrix} 0.573 & -0.761 & -0.296 \\ 0.567 & 0.631 & -0.530 \\ 0.591 & 0.136 & 0.795 \end{pmatrix}.$$

This is the rotation around the axis of orientation $(0.384, -0.512, 0.768)$ by angle $60^\circ$
screw-wise.
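
The axis and angle of this rotation can be recovered from the matrix entries themselves, which also gives a quick consistency check of the example. The Python sketch below is an illustration under stated assumptions: the entries are read to three decimals with the (2,1) entry taken as $+0.567$ (the sign that makes the rows orthonormal), and image rays are assumed to transform as $R^T (x, y, f)^T$. The function names are hypothetical.

```python
import math

# Entries of R as read from the example (three decimals). Reading the (2,1)
# entry as +0.567 is an assumption; it is the sign that makes the rows orthonormal.
R = [[0.573, -0.761, -0.296],
     [0.567,  0.631, -0.530],
     [0.591,  0.136,  0.795]]

def orthonormality_error(M):
    """Largest absolute entry of M M^T - I."""
    err = 0.0
    for i in range(3):
        for j in range(3):
            g = sum(M[i][k] * M[j][k] for k in range(3))
            err = max(err, abs(g - (1.0 if i == j else 0.0)))
    return err

def axis_angle(M):
    """Unit axis and angle (degrees) of a rotation matrix M.
    Valid when the angle is neither 0 nor 180 degrees."""
    cos_t = max(-1.0, min(1.0, (M[0][0] + M[1][1] + M[2][2] - 1.0) / 2.0))
    w = [M[2][1] - M[1][2], M[0][2] - M[2][0], M[1][0] - M[0][1]]
    n = math.sqrt(sum(c * c for c in w))
    return [c / n for c in w], math.degrees(math.acos(cos_t))

def transform_point(M, x, y, f=1.0):
    """Image point after the rotation, assuming the ray (x, y, f) maps to M^T (x, y, f)."""
    v = [x, y, f]
    w = [sum(M[k][i] * v[k] for k in range(3)) for i in range(3)]
    return f * w[0] / w[2], f * w[1] / w[2]

axis, angle_deg = axis_angle(R)
# Image of the invariant center of gravity of S_0 under the rotation.
center_S1 = transform_point(R, -0.081, -0.202)
print(orthonormality_error(R), axis, angle_deg, center_S1)
```

Because the entries are rounded, the matrix is orthonormal only to about two decimals, and the recovered angle is close to, but not exactly, 60 degrees.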


8.  CONCLUDING  REMARKS 


In this paper, we have presented invariant properties of an image with respect to
camera rotation, introducing the notions of "invariance" and "irreducibility" and
translating results from projective geometry in terms of the (inhomogeneous) image coordinate
system. We also gave an example, computing the invariant center of gravity and
the invariant principal axes and reconstructing the camera rotation. The procedure does
not require the knowledge of point-to-point correspondence on the image plane. Many
other applications are also possible.

Consider the problem of shape recognition. Suppose we have a reference image
obtained from a certain camera orientation. If a test image is obtained from a different
camera orientation, the two images cannot be compared directly due to projective distortion.
However, Theorem 5 provides an easy test for their equivalence. Namely, as is
also shown in the previous example, if the invariants of (7.4) have different values, the
two regions cannot be equivalent and the test shape is rejected.

If $C$, $a$ and $B$ alone are sufficient to characterize the set of test shapes in question
completely, the equivalence is already determined at this stage. Otherwise, we can move
the test shape into the position of the reference shape in such a way that both have the
same $a$ and $B$. Then, the rest of the shape characteristics are compared to test for the
equivalence. The necessary camera rotation is reconstructed as described in the previous
section, and the corresponding image transformation is performed either by actually
moving the camera or by numerically computing the image transformation (2.2).

We say that a region on the image plane is in the standard position if the invariant
center of inertia coincides with the origin of the image plane and the
invariant principal axes coincide with the $x$- and $y$-axes. Any region on the image plane
can be moved into the standard position by camera rotation $R$ such that (i) $B$ is diagonalized
in the form

$$R^T B R = \begin{pmatrix} \sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0 \\ 0 & 0 & \sigma_3 \end{pmatrix},$$

where $\sigma_3$ is the largest principal value and (ii) if

$$R^T a = \begin{pmatrix} a'_1 \\ a'_2 \\ a'_3 \end{pmatrix},$$

then $a'_3 > 0$.

Evidently, shape recognition becomes easier if the test shapes are always moved
into the standard position (either by actually rotating the camera or by computation).
However, this technique is not restricted to shape recognition. If a camera is tracking a
moving object while the camera position is fixed, or if a camera attached to a robot or
an autonomous vehicle is aiming at a fixed object in the stationary scene, the technique
described above can be used so that the object in question is always seen in the standard
position.

On the other hand, testing the equivalence is also viewed as detecting active
motion. When an object image moves on the image plane, we call the motion passive if
that motion is induced by camera rotation alone and active otherwise. When the camera
orientation is changed, object images move on the image plane, but those objects may
also have moved in the scene independently of the camera. According to the procedure
described above, we can detect active motion even if the angle and orientation of camera
rotation are not known. If the corresponding two object images are not equivalent, the
object must have moved actively. If they are equivalent, the object has not moved in
the scene, although motion is observed on the image plane. In the previous example, if
the three regions $S_0$, $S_1$, $S_2$ are images of the same object, we can conclude that an active
motion took place between $S_0$ (or $S_1$) and $S_2$ while no such motion took place between
$S_0$ and $S_1$.

Another possible application is camera orientation registration. Even if the camera
is rotated by an unknown angle around an unknown axis, the camera orientation can be
determined as long as one particular region corresponding to a stationary object is
identified on the image plane before and after the camera rotation. Thus, the principle
we have described has a wide range of applications to many problems.


APPENDIX A  RECIPROCITY AND CONJUGACY


Consider a line $l$ on the image plane which does not pass through the origin. Let

$$x\cos\theta + y\sin\theta = d \tag{A.1}$$

($d>0$) be its equation. We say that point

$$P\left(-\frac{f^2}{d}\cos\theta,\ -\frac{f^2}{d}\sin\theta\right) \tag{A.2}$$

is reciprocal to line $l$ with respect to the origin. Conversely, line $l$ is said to be reciprocal
to point $P$ with respect to the origin. In other words, if we draw a line through the
origin and perpendicular to line $l$, and if $d$ is the distance between the origin and line $l$,
the reciprocal point $P$ is located on this perpendicular line, on the side of the origin
opposite to line $l$, at distance $f^2/d$ from the origin (Fig. A1). If $d=0$, point $P$ is interpreted
as located at infinity (at $(\cos\theta,\sin\theta,0)$ in homogeneous coordinates), and similarly the line
at infinity is regarded as the reciprocal line of the origin $O$.
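
The definition of reciprocity can be illustrated with a short computation. The Python sketch below uses hypothetical values of $\theta$, $d$ and $f$, and the function names are introduced here only for illustration. Besides evaluating (A.2), it checks one consequence that follows directly from (A.1) and (A.2): the ray $(P, f)$ from the focus is perpendicular to the ray $(x, y, f)$ toward every point $(x, y)$ of line $l$.

```python
import math

def reciprocal_point(theta, d, f=1.0):
    """Point reciprocal to the line x cos(theta) + y sin(theta) = d (d > 0), eqn (A.2)."""
    return (-(f * f / d) * math.cos(theta), -(f * f / d) * math.sin(theta))

def point_on_line(theta, d, t):
    """Point of the line at signed distance t from the foot of the perpendicular."""
    return (d * math.cos(theta) - t * math.sin(theta),
            d * math.sin(theta) + t * math.cos(theta))

f = 1.0                                   # hypothetical focal length
theta, d = math.radians(35.0), 0.8        # hypothetical line parameters
P = reciprocal_point(theta, d, f)

# 3D inner products between the ray (P, f) and rays toward points of the line.
dots = []
for t in (-2.0, 0.0, 1.5):
    x, y = point_on_line(theta, d, t)
    dots.append(P[0] * x + P[1] * y + f * f)

# Product of the distances of P and of the line from the origin.
dist_product = math.hypot(P[0], P[1]) * d
print(P, dots, dist_product)
```

The inner products vanish and the product of the two distances equals $f^2$, which is the "distance $f^2/d$" statement above in numerical form.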

Consider a line $l$ and a point $P$ on it on the image plane. Let $H$ be the foot of the
perpendicular line drawn from the origin to line $l$, and let $d$ be the distance between
point $P$ and point $H$. Consider a point $Q$ on line $l$, on the other side of $H$, at distance $f^2/d$
from point $H$ (Fig. A2). We say that point $Q$ is conjugate to point $P$ on line $l$ and conversely
point $P$ is conjugate to point $Q$ on line $l$. If $d=0$, $Q$ is regarded as located at
infinity.

As stated in Theorem 3, a 3D vector is represented as a point or as a line on the
image plane. By definition, the point and the line are easily shown to be mutually
reciprocal. Hence, if one is known, the other is obtained immediately.

As stated in Theorem 4, a 3D symmetric tensor is represented by three scalars, a
point and a line through it. Let $e_1$, $e_2$, $e_3$ be the unit vectors of the principal axes
(determined up to sign). Let $P_1$, $P_2$, $P_3$ be the points corresponding to them, and let $l_1$
be the line connecting points $P_2$ and $P_3$, $l_2$ the line connecting points $P_3$ and $P_1$, and $l_3$
the line connecting points $P_1$ and $P_2$ (Fig. A3). Then, it is easy to see that point $P_1$ and
line $l_1$ are mutually reciprocal, and so are point $P_2$ and line $l_2$, and point $P_3$ and line $l_3$.
It is also seen that points $P_2$ and $P_3$, points $P_3$ and $P_1$, and points $P_1$ and $P_2$ are conjugate
on lines $l_1$, $l_2$ and $l_3$, respectively. Hence, if point $P_3$ and line $l_1$ are given, line $l_3$ and
point $P_1$ are obtained as their reciprocals. Point $P_2$ is given as the intersection between
lines $l_1$ and $l_3$, and line $l_2$ is given as the line connecting points $P_3$ and $P_1$. Thus, a
point and a line passing through it are sufficient to represent on the image plane the
orientations of the three principal axes of a 3D symmetric tensor.


APPENDIX  B  FUNCTION  BASIS  OF  IRREDUCIBLE  REPRESENTATIONS 


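From eqns (4.10), the Casimir operator becomes

$$H = -\left(f+\frac{x^2+y^2}{f}\right)\left[\left(f+\frac{x^2}{f}\right)\frac{\partial^2}{\partial x^2} + \frac{2xy}{f}\frac{\partial^2}{\partial x\,\partial y} + \left(f+\frac{y^2}{f}\right)\frac{\partial^2}{\partial y^2} + \frac{2x}{f}\frac{\partial}{\partial x} + \frac{2y}{f}\frac{\partial}{\partial y}\right], \tag{B.1}$$

so that eqn (4.14) becomes

$$\left(f+\frac{x^2+y^2}{f}\right)\left[\left(f+\frac{x^2}{f}\right)F_{xx} + \frac{2xy}{f}F_{xy} + \left(f+\frac{y^2}{f}\right)F_{yy}\right] + 2x\left(1+\frac{x^2+y^2}{f^2}\right)F_x + 2y\left(1+\frac{x^2+y^2}{f^2}\right)F_y + l(l+1)F = 0. \tag{B.2}$$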

Since representations of half-integer weights are not interesting because the same
image must be obtained after a rotation of $2\pi$ (the sign is reversed after a rotation of $2\pi$
if the weight is a half-integer), we consider only irreducible representations of integer
weights.

For $l=0$ ($l(l+1)=0$), one solution ($2l+1=1$) is easily found:

$$F_0^1(x,y)=1. \tag{B.3}$$

Obviously, this is invariant with respect to rotation:

$$D_1F_0^1=0, \quad D_2F_0^1=0, \quad D_3F_0^1=0, \tag{B.4}$$


and  hence 


For $l=2$ ($l(l+1)=6$), the following five solutions ($2l+1=5$) are found:


~</  -r/  )  3(  J"— 1/  -/") 

F,3(z,y)=  —  F24(j.i/)=—  ^ ,  F.,b(i.y  )=— - 

x~+y-+f~  x-^y-+f-  x- 


Application of the infinitesimal generators $D_1$, $D_2$, $D_3$ yields

$$D_i \begin{pmatrix} F_2^1 \\ F_2^2 \\ F_2^3 \\ F_2^4 \\ F_2^5 \end{pmatrix} = -A_i \begin{pmatrix} F_2^1 \\ F_2^2 \\ F_2^3 \\ F_2^4 \\ F_2^5 \end{pmatrix}, \quad i=1,2,3,$$


where 


2 

_o 

-1 

to 

II 

1 

>  ^  3 — 

1 

-2  -1 

12 

-1 

-  .j 

and  the  commutator  relations  are  satisfied: 


[^1.^2]— -^3'  2>^  3]  ^  1’  [^3-^ll— ^2- 


Consequently,  for  infinitesimal  rotations,  we  have 


Sfg 


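functions $F_2^1$, $F_2^2$, $F_2^3$, $F_2^4$, $F_2^5$ are transformed as

$$T_R \begin{pmatrix} F_{11} & F_{12} & F_{13} \\ F_{21} & F_{22} & F_{23} \\ F_{31} & F_{32} & F_{33} \end{pmatrix} = R \begin{pmatrix} F_{11} & F_{12} & F_{13} \\ F_{21} & F_{22} & F_{23} \\ F_{31} & F_{32} & F_{33} \end{pmatrix} R^T. \tag{B.18}$$

Solutions for $l=3,4,\ldots$ are constructed similarly. In fact, functions $F_l^1$, $F_l^2$, ..., $F_l^{2l+1}$
are just the $l$-th spherical harmonics projected onto the image $xy$-plane.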


APPENDIX  C  FEATURE  BASIS  OF  IRREDUCIBLE  REPRESENTATIONS 

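From eqns (5.6), the Casimir operator becomes

$$H^* = -\left(f+\frac{x^2+y^2}{f}\right)\left[\left(f+\frac{x^2}{f}\right)\frac{\partial^2}{\partial x^2} + \frac{2xy}{f}\frac{\partial^2}{\partial x\,\partial y} + \left(f+\frac{y^2}{f}\right)\frac{\partial^2}{\partial y^2} + \frac{8x}{f}\frac{\partial}{\partial x} + \frac{8y}{f}\frac{\partial}{\partial y}\right] - 6 - \frac{12(x^2+y^2)}{f^2}, \tag{C.1}$$

so that eqn (5.10) becomes

$$\left(f+\frac{x^2+y^2}{f}\right)\left[\left(f+\frac{x^2}{f}\right)m_{xx} + \frac{2xy}{f}m_{xy} + \left(f+\frac{y^2}{f}\right)m_{yy} + \frac{8x}{f}m_x + \frac{8y}{f}m_y\right] + \left[l(l+1)+6+\frac{12(x^2+y^2)}{f^2}\right]m = 0. \tag{C.2}$$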


For $l=0$ ($l(l+1)=0$), the following solution ($2l+1=1$) is found:

$$m_0^1(x,y) = \frac{1}{(\sqrt{x^2+y^2+f^2})^3}. \tag{C.3}$$

Application of the infinitesimal generators $D_1^*$, $D_2^*$, $D_3^*$ yields

$$D_1^* m_0^1 = 0, \quad D_2^* m_0^1 = 0, \quad D_3^* m_0^1 = 0,$$

and consequently $m_0^1$ is invariant for $T_R^*$.

Feature $J$ of eqn (5.11) is obtained by

$$J = \int m_0^1(x,y)F(x,y)\,dx\,dy.$$

From eqn (C.5), this is a scalar.

For $l=1$ ($l(l+1)=2$), the following three solutions ($2l+1=3$) are found:

$$m_1^1(x,y)=\frac{x}{(x^2+y^2+f^2)^2}, \quad m_1^2(x,y)=\frac{y}{(x^2+y^2+f^2)^2}, \quad m_1^3(x,y)=\frac{f}{(x^2+y^2+f^2)^2}.$$


Application of the infinitesimal generators $D_1^*$, $D_2^*$, $D_3^*$ yields

$$D_i^* \begin{pmatrix} m_1^1 \\ m_1^2 \\ m_1^3 \end{pmatrix} = -A_i^* \begin{pmatrix} m_1^1 \\ m_1^2 \\ m_1^3 \end{pmatrix}, \quad i=1,2,3,$$


where 


r 

-1 

A‘i  = 

-1 

,  A  0  = 

,  ^3  = 

1 

1 

and the commutator relations are satisfied:

$$[A_1^*,A_2^*]=A_3^*, \quad [A_2^*,A_3^*]=A_1^*, \quad [A_3^*,A_1^*]=A_2^*.$$


Consequently,  for  infinitesimal  rotations,  we  have 


m 


and the commutator relations are satisfied:

$$[A_1^*,A_2^*]=A_3^*, \quad [A_2^*,A_3^*]=A_1^*, \quad [A_3^*,A_1^*]=A_2^*.$$

Consequently,  for  infinitesimal  rotations,  we  have 


mo1 

'  r 

772  o 

O 

mo~ 

0 

772  0“ 

m> 

=  /—  (fi  j/1  ^  2  3  ) 

m23 

m  24 

m24 

m25 

m25 

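This implies that if we put

$$m_{11}=m_2^1, \quad m_{22}=m_2^2, \quad m_{33}=-m_2^1-m_2^2,$$

$$m_{12}=m_{21}=m_2^3, \quad m_{23}=m_{32}=m_2^4, \quad m_{31}=m_{13}=m_2^5,$$

functions $m_2^1$, $m_2^2$, $m_2^3$, $m_2^4$, $m_2^5$ are transformed as

$$T_R^* \begin{pmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{pmatrix} = R^T \begin{pmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{pmatrix} R.$$
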

Features $B_{ij}$, $i,j=1,2,3$, of eqns (6.7) are obtained by

•4  =/ ( "‘0  ( -r.  >J ) +  j  "‘  o*(  1/  ( ar,  V ) didy, 


(C.17) 


(C.18) 


(C.19) 


(C.20) 


(C.21) 


where $\delta_{ij}$ is the Kronecker delta. From eqns (C.5) and (C.20), they are transformed as a
3D tensor.


Solutions for $l=3,4,\ldots$ are constructed similarly. In fact, we can check, by substitution,
that the solution is given by

$$m_l^i(x,y) = \frac{F_l^i(x,y)}{(\sqrt{x^2+y^2+f^2})^3}, \quad i=1,2,\ldots,2l+1. \tag{C.22}$$


APPENDIX  D  INVARIANT  MEASURE 

We say that $\rho(x,y)\,dx\,dy$ is an invariant measure if for any image $F(x,y)$

$$\int T_R F(x,y)\rho(x,y)\,dx\,dy = \int F(x,y)\rho(x,y)\,dx\,dy. \tag{D.1}$$

In view of eqns (5.1) and (5.3), this is equivalent to

$$T_R^*\rho(x,y) = \rho(x,y). \tag{D.2}$$

Hence, $\rho$ is given by the solution of eqn (5.10) with $l=0$. From eqn (C.3) of Appendix
C, we obtain

$$\rho(x,y) = \frac{1}{(\sqrt{x^2+y^2+f^2})^3}. \tag{D.3}$$

This result can be interpreted intuitively in terms of fluid dynamics. Suppose the
camera is rotating with rotation velocity $(\omega_1,\omega_2,\omega_3)$, namely rotating around an axis of
orientation $(\omega_1,\omega_2,\omega_3)$ with angular velocity $\sqrt{\omega_1^2+\omega_2^2+\omega_3^2}$ (rad/sec) screw-wise. (Here,
$\omega_1$, $\omega_2$, $\omega_3$ are also interpreted as instantaneous angular velocities around the $X$-, the $Y$-
and the $Z$-axis, respectively.) The optical flow induced on the image plane is obtained by
dividing both sides of eqns (4.7) by $\delta t$:

$$u = -f\omega_2 + \omega_3 y + \frac{1}{f}(-\omega_2 x+\omega_1 y)x, \quad v = f\omega_1 - \omega_3 x + \frac{1}{f}(-\omega_2 x+\omega_1 y)y. \tag{D.4}$$

If this flow is regarded as a fluid flow with density $\rho(x,y)$, the necessary and sufficient
condition that the fluid is neither created nor annihilated in the course of flowing is, as is
well known, given by the equation of continuity

$$\frac{\partial(\rho u)}{\partial x} + \frac{\partial(\rho v)}{\partial y} = 0. \tag{D.5}$$

If eqns (D.4) are substituted, eqn (D.5) becomes

$$(\omega_1 D_1^* + \omega_2 D_2^* + \omega_3 D_3^*)\rho = 0. \tag{D.6}$$

This equation must be satisfied for arbitrary $\omega_1$, $\omega_2$, $\omega_3$. Hence, the invariant measure
$\rho(x,y)$ is given as a solution of the differential equations

$$D_1^*\rho=0, \quad D_2^*\rho=0, \quad D_3^*\rho=0. \tag{D.7}$$
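
The fluid interpretation can be verified numerically. The Python sketch below is an illustration under stated assumptions: the flow is taken as $u = -f\omega_2 + \omega_3 y + (1/f)(-\omega_2 x + \omega_1 y)x$, $v = f\omega_1 - \omega_3 x + (1/f)(-\omega_2 x + \omega_1 y)y$, the density as $\rho = (x^2+y^2+f^2)^{-3/2}$, and the divergence in the equation of continuity is approximated by central differences. The focal length, rotation velocity and sample points are hypothetical.

```python
import math

def flow(x, y, w, f):
    """Optical flow (u, v) induced by camera rotation velocity w = (w1, w2, w3)."""
    w1, w2, w3 = w
    common = (-w2 * x + w1 * y) / f
    u = -f * w2 + w3 * y + common * x
    v = f * w1 - w3 * x + common * y
    return u, v

def rho_invariant(x, y, f):
    """Candidate invariant density 1 / (sqrt(x^2 + y^2 + f^2))^3."""
    return (x * x + y * y + f * f) ** -1.5

def rho_uniform(x, y, f):
    """Uniform density, used only for comparison."""
    return 1.0

def continuity_residual(rho, x, y, w, f, h=1e-4):
    """d(rho u)/dx + d(rho v)/dy at (x, y), by central differences."""
    def ru(px, py):
        return rho(px, py, f) * flow(px, py, w, f)[0]
    def rv(px, py):
        return rho(px, py, f) * flow(px, py, w, f)[1]
    return ((ru(x + h, y) - ru(x - h, y)) + (rv(x, y + h) - rv(x, y - h))) / (2 * h)

f = 1.5                      # hypothetical focal length
w = (0.7, -0.4, 1.1)         # hypothetical rotation velocity
points = [(0.0, 0.0), (0.5, -0.3), (-1.2, 0.8), (2.0, 1.0)]

res_invariant = [abs(continuity_residual(rho_invariant, x, y, w, f)) for x, y in points]
res_uniform = [abs(continuity_residual(rho_uniform, x, y, w, f)) for x, y in points]
print(res_invariant, res_uniform)
```

For the candidate density the residual vanishes up to discretization error at every sample point. For the uniform density it equals $(3/f)|-\omega_2 x + \omega_1 y|$, which is nonzero away from the line $-\omega_2 x + \omega_1 y = 0$, so ordinary area is not preserved by the flow.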
FIGURES


Fig. 1   The $XYZ$-coordinate system is fixed to the camera, the origin $O$ being the
         camera focus. The image plane is taken to be $Z=f$, where $f$ is the camera
         focal length. A point $(X,Y,Z)$ in the scene is projected onto point $(x,y)$ on
         the image plane.

Fig. 2   (a) Three regions $S_0$, $S_1$, $S_2$ to be tested for equivalence. (b) Computed
         invariant centers of gravity $G_0$, $G_1$, $G_2$ of regions $S_0$, $S_1$, $S_2$. (c) Computed
         invariant principal axes of regions $S_0$, $S_1$, $S_2$.

Fig. A1  Line $l$: $x\cos\theta + y\sin\theta = d$ and point $P(-(f^2/d)\cos\theta, -(f^2/d)\sin\theta)$ are mutually
         reciprocal with respect to the origin $O$.

Fig. A2  Points $P$, $Q$ on line $l$ are mutually conjugate with respect to the foot $H$ of
         the perpendicular line drawn from the origin $O$ to line $l$.

Fig. A3  Point $P_1$ and line $l_1$, point $P_2$ and line $l_2$, point $P_3$ and line $l_3$ are mutually
         reciprocal, and points $P_2$ and $P_3$ on line $l_1$, points $P_3$ and $P_1$ on line $l_2$,
         points $P_1$ and $P_2$ on line $l_3$ are mutually conjugate.